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- Abstract 



A simple method for the parallelization of extensive air shower simulations is described. A shower is simulated at fixed steps in 
altitude. At each step, daughter particles below a specified energy threshold are siphoned off and tabulated for further simulation. 
Once the entire shower has been tabulated, the resulting list of particles is concatenated and divided into separate list files where 
each possesses a similar projected computation time. These lists are then placed on a computation cluster where the simulation can 
be completed in a piecemeal fashion as computing resources become available. Once the simulation is complete, the outputs are 
reassembled as a complete air shower simulation. The original simulation program (in this case CORSIKA) is in no way altered 
for this procedure. Verification is obtained by comparisons of 10'^^ eV showers produced with and without parallelization. 
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y. Introduction 

In the past 50 years, much progress has been made in 
the understanding of Extensive Air Showers (EAS) associated 
with Ultra-High Energy Cosmic Rays (UHECRs). However, 
the historical difference in enerey determination between Sur- 
face Detection (SD) fl] H H fl and Fluorescence Detection 
,(FD) L5]|6]|7] has yet to be resolved. In its hybrid mode, the 
■Pierre Auger experiment ^ reports a 30% discrepancy for sim- 
ulation based energy determination between SD and FD for 
events observed in hybrid operation mode [9]. 
\ We posit that this discrepancy could be better understood if 
it were not for the fact that it has been computationally infea- 
sible to simulate large numbers of EAS with primary energy 
> 10'^ eV without utilizing statistical thinning methods that 
fully simulate only a small representative fraction of the EAS. 
These thinned simulations certainly can be adequate for cal- 
culating longitudinal profiles ifioll and average lateral distribu- 
tions III III . However, they neither capture the full breadth of 
fluctuations at the distance scale of individual surface detector 
counters nor do they provide all of the specific particle informa- 
tion necessary to properly estimate counter energy deposition 
and the consequent electronic response. 

Non-thinned simulations of UHECRs are very computation- 
ally intensive. Using a single modern CPU core, the simula- 
tion of the EAS for a single 10'^ eV proton requires on the 
order of 1 day. Simulation times increase more or less linearly 
with energy. Extrapolated to the logical extreme, this implies 
that, without continued progress in computational ability, the 
simulation of the largest UHECR observation reported thus far 
(3.2 X 10^" eV |5]) could span the better part of a decade. 
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One solution for this computational deficiency is paralleliza- 
tion. By dividing the task between many different CPU cores, a 
simulation that previously would have taken years to conclude 
can be completed in days or even hours. By employing scor- 
ing strategies to optimize the division of labor, it is possible to 
simulate even the largest observed EAS with nothing more than 
the spare computational power that inadvertently arises in the 
scheduling of large jobs on computational clusters. 

This paper is the first of three to describe methods used in 
the Telescope Array (TA) Collaboration [Il2i Il3ll to simulate 
EAS as seen by the TA surface detector (TASD). The second 
paper will deal with "dethinning," that is, replacing the shower 
particles eliminated in the thinning process. The third paper 
will describe the simulation of the TASD response to EAS and 
show comparisons between the actual TASD data and a spectral 
set of simulated EAS. 

2. Parallelization Overview 

Simulation programs for EAS are particularly well-suited 
for parallelization because they do not contain self-interaction. 
That is, the individual daughter particles in the shower interact 
exclusively with atmospheric medium and not with each other. 
Thus, each EAS can be thought of as the superposition of many 
smaller sub-showers. Parallelization can then be carried out in 
the following steps: 

1 . Initially, a single computer is utilized to separate the EAS 
simulation into many smaller, more manageable, simula- 
tions by running the simulation repeatedly through small 
steps in atmospheric depth. 

2. At each step, the simulation output is sorted with parti- 
cles above a nominal upper threshold being passed back 
through the simulation. Particles below a lower variable 
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threshold are discarded and the rest of the output is ap- 
pended to a master Hst. 

3. Eventually, all of the simulated particles fall below the 
nominal upper threshold. The master list contains all the 
input parameter sets necessary for a series of simulations 
that can be superimposed to reconstitute the the original 
EAS. 

4. The master list is then divided into sub-hsts and divided 
amongst a larger number of computers either manually or 
via clustering. 

5. When all of the sub-list simulations are finished, the final 
total simulation can then be reassembled. 

A critical aspect of this procedure is that the actual simulation 
source code is in no way altered. All aspects of parallelization 
are achieved by translating each generation of simulation out- 
put files into the next generation of simulation input files via 
a series of scripts and compiled programs under the direction 
of a master script which explicitly tracks spatial and temporal 
information for each component simulation. 

3. Parallelization Application 

While this is, in principle, a fairly straightforward procedure, 
it is important to consider how this procedure works in practice 
for a particular simulation package. For this purpose, we use 
CORSIKA v6.960 |14j. Hi gh e nergy hadronic interactions are 
modeled by QGSJET-II-03 OlSll . low energy hadronic interac- 
tions are modeled by FLUKA2008.3c [ 16, 17[], and electromag- 
netic interactions are modeled by EGS4 

As a first step, a CORSIKA "input card" is created for the 
primary particle. While this input card can encompass a wide 
range of CORSIKA options, there are some important con- 
straints: 

1 . There is only one EAS in the simulation run. 

2. The starting point of the arrival time scale is set to the first 
interaction. This greatly simplifies tracking time offsets 
later in the procedure 

3. The zenith angle of the primary particle does not exceed 
60°. This constraint is necessary in order to keep the 
primary zenith angles of the secondary EAS below 70°. 
Above 70°, CORSIKA uses a curved earth model which 
greatly complicates the propagation of temporal and spa- 
tial offsets through the parallelization. 

4. The observation level is set high in the atmosphere (e.g. 
80 km). By doing so, we minimize the simulation time for 
the linear stage of the procedure. 

5. The ground level core location is set to have the same x-y 
coordinate system as the EAS. 

Once an input card is created, it is submitted to the COR- 
SIKA simulation package. The particles in the simulated out- 
put are then each assigned a score, T, which corresponds to the 
maximum time necessary to simulate a complete EAS down to 
the final observation level for that particle. The value of T, was 
determined using the following relation: 

Ti oc Ej X e'-'\ (1) 



where £, is the energy of the particle, f is the atmosphere slant 
depth between the position of the particle and the ground, and 
T = 50 g/cm^. While this is a crude estimate of simulation time 
at best, it is fairly accurate insofar as the EAS maximum does 
not occur too far above the ground. 

The particles are then divided into two categories: T, > T,„ax 
and r, < r„,f„, where T„,„.( is the maximum time to completion 
for individual jobs in parallel portion of this process. Particles 
where T, < T„,ax are added to a master list. Each line of the 
list contains: T,, particle type, energy, trajectory, position, time, 
and random number seeds. 

For particles where T, > T„,ax, the time and position are noted 
in a separate file and a new CORSIKA input card is generated. 
The new card has an observation level 50 g/cm- lower in slant 
depth than the starting altitude. The resulting set of input cards 
is then submitted to the CORSIKA package for simulation. The 
resulting particles have their times and positions offset by the 
initial values recorded above. This process undergoes many 
iterations until there are no particles where T, > T^ax- 

Once the shower is decomposed into particles where Tj < 
T„,ax, a cut is applied on particles where Ej < tyj*{ 1 MeV-cm^/g) 
with f,,,- being the vertical depth between the particle and the 
ground level. This eliminates particles which are sufficiently 
low in energy that subsequent propagation would not be ex- 
pected to persist to ground level due to either ionization and/or 
further EAS production. The resulting list of particles is then 
sorted into sub-lists where X/ Ti - T„ax- 

The simulation can now be carried out in a parallel fashion. 
For each sub-list, CORSIKA input cards are created for each 
individual particle. This differs from the non-parallel portion 
above in that every EAS is simulated all the way to ground 
level. Once the EAS are simulated for every particle, the time 
and position offsets for each particle is applied to the respective 
CORSIKA output file and then all the EAS from the sub-list are 
concatenated into a single CORSIKA output file. From there, 
the final step is to further concatenate the concatenated sub-list 
outputs from the parallel jobs into a single CORSIKA output 
file. 

The predominate advantage of this procedure lies in its in- 
herent flexibility. Because each sub-list is independent in its 
execution, the simulation can proceed on whatever number of 
computational nodes is available and even on multiple systems. 
Because the size of each sub-list can be controlled by simply 
increasing or decreasing T,,,,, , it is very simple to make use of 
whatever excess capacity might be available due to scheduling 
gaps in a large computational cluster Furthermore, it would be 
trivial to adapt this method to volunteer computing. 

4. Parallelization Validation 

The parallelization method is validated by comparing pairs 
of EAS simulated with the same input parameters both with 
and without parallelization. For comparison purposes, we initi- 
ated with a 10'^^ eV proton at a fixed height of first interaction 
of 30 km. This comparatively low primary energy was chosen 
due to the time required to generate EAS without paralleliza- 
tion. Four difl'erent primary zenith angles were selected: 0°, 
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Figure 1 : Comparison of secondary particle spectra for two protonic sliowers (one generated witli a single CPU and one generated in parallel with many CPUs) 
with primary energy Eq = lO'*'^ eV and primary zenith angle 0d = 0° . For each shower, simulated particles whose ground position was within a region enclosed 
by shower rotation angles <J> = [-30°, 30°] and lateral distances r = [500m, 1000m] were tabulated with respect to particle type, incident angle with respect to the 
ground, 6,, and kinetic energy. The resulting spectra are shown in cos 6i = 0.1 increment bins for a) photons, b) electrons and positrons, and c) muons. For each 
histogram, good agreement is observed between simulations generated Unearly (black) and via paraUeUzation (gray). 
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Figure 2: Comparison of secondary particle spectra for two protonic sliowers (one generated witli a single CPU and one generated in parallel with many CPUs) 
with primary energy Ea = lO"*'^ eV and primary zenith angle So = 30° . For each shower, simulated particles whose ground position was within a region enclosed 
by shower rotation angles <J> = [-30°, 30°] and lateral distances r = [500m, 1000m] were tabulated with respect to particle type, incident angle with respect to the 
ground, 6,, and kinetic energy. The resulting spectra are shown in cos 6i = 0.1 increment bins for a) photons, b) electrons and positrons, and c) muons. For each 
histogram, good agreement is observed between simulations generated linearly (black) and via parallelization (gray). 
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Figure 3: Comparison of secondary particle spectra for two protonic sliowers (one generated witli a single CPU and one generated in parallel with many CPUs) 
with primary energy Ea = lO"*'^ eV and primary zenith angle So = 45° . For each shower, simulated particles whose ground position was within a region enclosed 
by shower rotation angles <J> = [-30°, 30°] and lateral distances r = [500m, 1000m] were tabulated with respect to particle type, incident angle with respect to the 
ground, 6,, and kinetic energy. The resulting spectra are shown in cos 6i = 0.1 increment bins for a) photons, b) electrons and positrons, and c) muons. For each 
histogram, good agreement is observed between simulations generated linearly (black) and via parallelization (gray). 
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Figure 4: Comparison of secondary particle spectra for two protonic EAS simulations (one generated with a single CPU and one generated in parallel with many 
CPUs) with primary energy i?o = lO'*'' eV and primary zenith angle 6q = 60° . For each shower, simulated particles whose ground position was within a region 
enclosed by shower rotation angles <D = [-30°, 30°] and lateral distances r = [500m, 1000m] were tabulated with respect to particle type, incident angle with respect 
to the ground, 0;, and kinetic energy. The resulting spectra are shown in cos 6i = 0.1 increment bins for a) photons, b) electrons and positrons, and c) muons. For 
each histogram, good agreement is observed between simulations generated linearly (black) and via paralleUzation (gray). 
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30°, 45°, and 60°. In Figures[T]|4] we show comparisons of sec- 
ondary spectra from CORSIKA EAS simulated with and with- 
out paralleHzation. While the same random number seeds were 
used for the single-process simulations and the top level of the 
parallelized simulations, this can only guarantee that the inter- 
actions down to the first observation level of the paralleHzation 
scheme (in this case 29 km) will be the same for the two simu- 
lations. Taking this built-in discrepancy into account, the simu- 
lations with and without paralleHzation agree remarkably well 
for Figures [T]|4] 

Another comparison that can be performed is to consider sec- 
ondary particle arrival times for specific points at the ground 
observation level. For this purpose, we consider a ring in the 
plane normal to the EAS trajectory that intersects the ground at 
the shower core with a radius of 300 m and 2 m thickness. The 
ring is then further divided azimuthally into ~ 1000 2-meter 
long segments. These segments are then projected onto the 
ground. For each segment, we tabulate the arrival time, fo, of 
the first particle for each segment and the time, fi/2, when 50% 
of the total particle flux for a given segment has arrived. These 
times are then corrected for the time offset between the posi- 
tions of each segment on the ground and the plane normal to 
the EAS. In Figure |5] we see comparative histograms of fo for 
all four simulation pairs described above. Figure |6] shows the 
same histogram comparisons for fi/2. For both the fo and fi/2 
comparisons, we see good agreement. 

5. Conclusion 

Our paralleHzation technique yields results that are com- 
pletely equivalent to the conventional linear method. The tech- 
nique's advantages include: 

1. No modification to the underlying simulation routines is 
necessary for this method. In the case of CORSIKA, 
scripts and binaries were used that were entkely external 
to CORSIKA itself. 

2. This technique is highly scalable. Simulations have been 
successfully executed on systems with more than 3,000 
concurrent computational cores. 

3. There is also a great deal of flexibility. Because each job 
within a simulation is functionally independent and the du- 
ration of each job can be specified, it's possible to utilize a 
wide range of different computational resources. 

There is one major caveat. While simulations that would 
have previously taken thousands of hours to complete can now 
be divided into thousands of jobs that each take hours to com- 
plete, the net use of computational resources is conserved. With 
our current resources, while it is now feasible to simulate 1 00 
showers with primary energies above 10'^ eV, it is not feasible 
to simulate the tens of thousands of showers necessary for a suf- 
ficient study of detector response and acceptance for a UHECR 
surface array. ParalleHzation does, however, provide the means 
to acquire a large reference library of EAS simulations for the 
purpose of developing further techniques. 
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Figure 5: Comparison of distribution of initial arrival times, to, for 2 X 2 m segments in plane normal to shower trajectory for lO"' ' eV protonic EAS simulations 
with primary zenith angles of a) 0°, b) 30°, c) 45°, and d) 60°. In each case, to was measured for segments 200 m lateral distance from the shower core. For each 
histogram, good agreement in both mean value and variance are observed between simulations generated linearly (black) and via paralleUzation (gray). 
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Figure 6: Comparison of distribution of median arrival times, fi/2, for 2 X 2 m segments in plane normal to shower trajectory for lO'*^ eV protonic EAS simulations 
with primary zenith angles of a) 0°, b) 30°, c) 45°, and d) 60°. In each case, ti/2 was measured for segments 200 m lateral distance from the shower core. For each 
histogram, good agreement in both mean value and variance are observed between simulations generated linearly (black) and via paralleUzation (gray). 
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